Similar resources
On using spoken data in corpus lexicography
Corpora are increasingly used in lexicography in order to provide good evidence for dictionary statements: the inclusion of spoken data in corpora is generally considered important. This paper raises some issues connected with the use of spoken data. It points out that the extensive differences between written and spoken language have great consequences for dictionary-making. It argues that the...
Mining Word Senses from Text for Corpus-Based Lexicography
This paper discusses the problem of automated lexicography. In the corpus-based approach, a lexicographer has to manually group contexts of a target word into clusters in order to identify word senses. When a large number of the contexts is given, this process becomes a tedious and time-consuming task. To overcome this problem, we propose an efficient technique based on unsupervised clustering....
Multivariate methods in the corpus-based lexicography A study of synonymy in Finnish
ion in comparison to Arppe (2006), the six distinct person/number features (e.g. FIRST PERSON SINGULAR, FIRST PERSON PLURAL, SECOND PERSON SINGULAR, and so on) were decomposed as a matrix of three person features (FIRST vs. SECOND vs. THIRD) and two number features (SINGULAR vs. PLURAL). In all, 108 contextual variables in the corpora turned out to exceed a minimum threshold of 24 occurrences (...
Multivariate Methods in Corpus-Based Lexicography: A Study of Synonymy in Finnish
The purpose of this paper is to present a case study of how multivariate statistical methods such as polytomous logistic regression can be adapted to discover and analyze the wide and complex range of linguistic factors which both influence and interact in the selection and usage of sets of more than two near-synonyms. The results reported in this paper are a follow-up of Arppe (2006), and a pr...
Journal
Journal title: Lexikos
Year: 2010
ISSN: 1684-4904
DOI: 10.4314/lex.v13i1.51383